This report was prepared with the following environment settings.
print(R.version)
## _
## platform x86_64-apple-darwin17.0
## arch x86_64
## os darwin17.0
## system x86_64, darwin17.0
## status
## major 4
## minor 1.1
## year 2021
## month 08
## day 10
## svn rev 80725
## language R
## version.string R version 4.1.1 (2021-08-10)
## nickname Kick Things
I am a complete beginner in philosophy and its philosophers, all the more so as an international student. Although I have heard of famous philosophers such as Plato and Aristotle, it is worth exploring their philosophy from the perspective of schools.
#what are the schools in this dataset
ps.schools = unlist(unique(ps.data$school))
#what are the titles in this dataset
ps.titles =unlist(unique(ps.data$title))
ps.school.split = split(ps.data,ps.data$school)
ps.school.table = as.data.frame(table(ps.data$school))
colnames(ps.school.table)=c("school","count")
ps.school.table = ps.school.table[order(ps.school.table$count,decreasing = TRUE),]
barplot(ps.school.table$count, las = 2, names.arg = ps.school.table$school,
col ="lightblue", main ="Most frequent schools",
ylab = "Sentence Count")
First, we can see how philosophers think about the world by analyzing how frequently certain words appear in their sentences.
# Since the data has already been lower-cased and tokenized, we can use the tokenized text and remove numbers, punctuation, stopwords, and extra whitespace
ps.corpus = Corpus(VectorSource(ps.data$tokenized_txt))
ps.corpus = tm_map(ps.corpus, removeNumbers)
ps.corpus = tm_map(ps.corpus, removePunctuation)
ps.corpus = tm_map(ps.corpus, removeWords, c("the", "and","also","two", "one","can","may","will","must","might","just","thus","therefore",stopwords("english")))
ps.corpus = tm_map(ps.corpus, stripWhitespace)
ps.tdm.all<-TermDocumentMatrix(ps.corpus)
ps.tdm.tidy=tidy(ps.tdm.all)
ps.tdm.overall=summarise(group_by(ps.tdm.tidy, term), sum(count))
#word frequency
ps.wordfreq <- data.frame(word =ps.tdm.overall$term ,freq=ps.tdm.overall$`sum(count)`)
ps.wordfreq <- ps.wordfreq[order(ps.wordfreq$freq,decreasing = TRUE),]
head(ps.wordfreq,20)
## word freq
## 79542 things 17482
## 47112 man 16152
## 80145 time 15307
## 26632 even 14461
## 29358 first 13880
## 53277 now 13041
## 69636 say 12856
## 87419 way 12854
## 51420 nature 12796
## 88698 world 12088
## 65068 reason 11464
## 88406 without 11432
## 79530 thing 11275
## 73614 something 10841
## 3537 another 10466
## 45095 like 10265
## 32924 good 10214
## 72488 since 10011
## 57335 part 9838
## 11089 case 9596
barplot(ps.wordfreq[1:20,]$freq, las = 2, names.arg = ps.wordfreq[1:20,]$word,
col ="lightblue", main ="Most frequent words",
ylab = "Word frequencies")
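To make the counting step concrete, here is a minimal base R sketch of the same idea on a toy input. The sentences and the short stopword list are hypothetical stand-ins for `ps.data$tokenized_txt` and `stopwords("english")`; it illustrates the technique and is not the exact pipeline used in this report.

```r
# Toy sentences standing in for ps.data$tokenized_txt (hypothetical data)
sentences <- c("the nature of things",
               "man is by nature a political animal",
               "all things are in flux")

# Hypothetical short stopword list; the report uses tm::stopwords("english")
stop.words <- c("the", "of", "is", "by", "a", "are", "in", "all")

# Split each sentence on whitespace, pool the tokens, and drop stopwords
tokens <- unlist(strsplit(sentences, "\\s+"))
tokens <- tokens[!(tokens %in% stop.words)]

# Count occurrences and sort from most to least frequent
counts <- sort(table(tokens), decreasing = TRUE)
toy.wordfreq <- data.frame(word = names(counts), freq = as.vector(counts))
print(toy.wordfreq)
```

Summing the counts of a term over all sentences, as done above with the tidied term-document matrix, gives the same kind of table at the scale of the full dataset.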
#wordcloud in general
wordcloud(ps.tdm.overall$term, ps.tdm.overall$`sum(count)`,
scale=c(3,0.5),
max.words=100,
min.freq=100,
random.order=FALSE,
rot.per=0.3,
use.r.layout=T,
random.color=FALSE,
colors=brewer.pal(8,"Dark2"))
Various philosophical topics show up among the high-frequency words: “things”, “man”, “world”, “nature”, “object”, “subject”, and so on. To identify the interesting words in each sentence, we use TF-IDF to weight each term within each sentence. It highlights terms that are more specific to a particular sentence.
# compute TF-IDF weighted document-term matrices for individual sentences
ps.dtm <- DocumentTermMatrix(ps.corpus,
control = list(weighting = weightTfIdf,stopwords = TRUE))
#ps.dtm=tidy(dtm)
ps.dtm = removeSparseTerms(ps.dtm , 0.99)
freq = data.frame(sort(colSums(as.matrix(ps.dtm)), decreasing=TRUE))
wordcloud(rownames(freq), freq[,1], max.words=100, colors=brewer.pal(5, "Dark2"))
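The TF-IDF weighting can be worked through by hand on a tiny corpus. The sketch below uses base R only and assumes the normalized form, term frequency divided by document length times log2(N / document frequency), which is how the `tm` documentation describes the default of `weightTfIdf`. The three toy documents and all variable names are hypothetical.

```r
# Three toy "sentences", already tokenized (hypothetical data)
docs <- list(c("nature", "reason", "nature"),
             c("reason", "god"),
             c("reason", "price", "money", "price"))

vocab <- sort(unique(unlist(docs)))

# Raw counts: one row per document, one column per term
counts <- t(vapply(docs,
                   function(d) tabulate(match(d, vocab), nbins = length(vocab)),
                   integer(length(vocab))))
colnames(counts) <- vocab

# Term frequency, normalized by the number of terms in each document
tf <- counts / rowSums(counts)

# Inverse document frequency: log2(number of documents / documents containing the term)
idf <- log2(nrow(counts) / colSums(counts > 0))

# TF-IDF weight of each term in each document
tfidf <- sweep(tf, 2, idf, `*`)
print(round(tfidf, 3))
```

In this toy corpus “reason” occurs in every document, so its inverse document frequency is log2(3/3) = 0 and its weight vanishes everywhere, while “nature” and “price” keep positive weights in the only documents that contain them. This is the sense in which TF-IDF favors terms that are specific to a sentence.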
Across sentences, “nature”, “existence”, and “object” stand out. Philosophers focus heavily on conceptual and abstract topics, which leads them to think about the essence of the world and of man.
creat.wordcloud(ps.school.split[1],ps.data,names(ps.school.split[1]))
For school analytic: “theory”, “sense”, “truth”, and “fact” are the keywords.
For school aristotle: “will”, “must”, “reason”, “nature”, and “animal” are the keywords.
For school capitalism: “price”, “money”, and “capital” are the main topics.
For school communism: “labour”, “value”, “production”, and “capital” are the main topics.
For school continental: “madness”, “form”, and “relationship” appear the most.
For school empiricism: “ideas/idea” and “reason” appear frequently.
For school feminism: unsurprisingly, “women/woman” appears the most; “man”, “love”, and “children” also appear frequently.
For school german_idealism: quite abstract and conceptual words such as “consciousness”, “self”, and “law” dominate.
For school nietzsche: “life”, “virtue”, “god”, “love”, and “soul” are the keywords.
For school phenomenology: “knowledge”, “experience”, and “consciousness” are the keywords.
For school plato: interestingly, “socrates” appears many times.
For school rationalism: why does “god” appear the most here? Perhaps it is often used as a counterexample.
For school stoicism: the archaic words “thou”, “thyself”, and “thee” suggest that this school dates from the ancient Greek period.
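The helper `creat.wordcloud` is not defined in this report, so the sketch below is only a hypothetical illustration of how per-school keywords like those listed above could be extracted. The function `top.terms.by.school`, the toy data frame, and the assumption that `tokenized_txt` holds space-separated lower-case tokens are all illustrative and are not taken from the original code.

```r
# Hypothetical helper: the n most frequent non-stopword terms for each school.
# Assumes data$tokenized_txt holds space-separated, lower-cased tokens.
top.terms.by.school <- function(data, n = 3, stop.words = character(0)) {
  by.school <- split(data$tokenized_txt, data$school)
  lapply(by.school, function(txt) {
    tokens <- unlist(strsplit(txt, "\\s+"))
    tokens <- tokens[nzchar(tokens) & !(tokens %in% stop.words)]
    counts <- sort(table(tokens), decreasing = TRUE)
    head(names(counts), n)
  })
}

# Toy data standing in for ps.data (hypothetical)
toy.data <- data.frame(
  school = c("plato", "plato", "capitalism", "capitalism"),
  tokenized_txt = c("socrates said the soul", "socrates asked",
                    "the price of money", "price and capital"),
  stringsAsFactors = FALSE
)

res <- top.terms.by.school(toy.data, n = 1, stop.words = c("the", "of", "and"))
print(res)
```

Returning a named list keyed by school keeps the result easy to inspect, and the same counts could be passed to `wordcloud()` to draw one cloud per school.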
Philosophy focuses mainly on conceptual and abstract topics.
Different schools of philosophy have their own characteristic topics.